
    Incremental learning of skills in a task-parameterized Gaussian Mixture Model

    Programming by demonstration techniques facilitate the programming of robots. Some of them allow tasks to be generalized through parameters, although they require retraining when trajectories different from those used to estimate the model need to be added. One way to retrain a robot is incremental learning, which supplies additional information about the task without requiring the whole task to be taught again. The present study proposes three techniques for adding trajectories to a previously estimated task-parameterized Gaussian mixture model. The first technique estimates a new model from the new trajectory together with a set of trajectories generated using the previous model. The second adds the parameters obtained for the new trajectories to those of the existing model. The third updates the model parameters by running a modified version of the Expectation-Maximization algorithm with the information from the new trajectories. The techniques were evaluated on a simulated task and a real one, and they showed better performance than the existing model.
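
    To make the third technique concrete, the sketch below shows a generic incremental EM-style update of a Gaussian mixture model with points from a new trajectory, where the old data are represented only through the previous parameters and an effective sample count. It is a minimal illustration under those assumptions, not the paper's task-parameterized formulation; all names and defaults are illustrative.

        # Minimal sketch, assuming the old data are summarized only by the previous
        # parameters and an effective sample count n_old; the task-parameterized
        # projections used in the paper are omitted. Names are illustrative.
        import numpy as np

        def gaussian_pdf(X, mean, cov):
            """Densities of the rows of X under a single Gaussian component."""
            d = X.shape[1]
            diff = X - mean
            expo = -0.5 * np.sum(diff @ np.linalg.inv(cov) * diff, axis=1)
            norm = np.sqrt(((2.0 * np.pi) ** d) * np.linalg.det(cov))
            return np.exp(expo) / norm

        def incremental_em_update(X_new, weights, means, covs, n_old, n_iter=10):
            """Blend statistics of the new points with the previous mixture model."""
            K, n_new = len(weights), X_new.shape[0]
            for _ in range(n_iter):
                # E-step on the new points only
                resp = np.stack([weights[k] * gaussian_pdf(X_new, means[k], covs[k])
                                 for k in range(K)], axis=1)
                resp /= resp.sum(axis=1, keepdims=True)
                # M-step: combine old (approximate) and new sufficient statistics
                for k in range(K):
                    Nk_old = n_old * weights[k]
                    Nk_new = resp[:, k].sum()
                    mu_new = resp[:, k] @ X_new / max(Nk_new, 1e-12)
                    means[k] = (Nk_old * means[k] + Nk_new * mu_new) / (Nk_old + Nk_new)
                    diff = X_new - means[k]
                    scatter = (diff * resp[:, [k]]).T @ diff
                    # The old covariance term neglects the shift of the mean (approximation)
                    covs[k] = (Nk_old * covs[k] + scatter) / (Nk_old + Nk_new)
                    weights[k] = (Nk_old + Nk_new) / (n_old + n_new)
            return weights, means, covs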

    Learning and Composing Primitive Skills for Dual-Arm Manipulation

    In an attempt to confer complex manipulation capabilities on robots, dual-arm anthropomorphic systems have become an important research topic in the robotics community. Most approaches in the literature rely upon a thorough understanding of the dynamics underlying the system's behaviour and yet offer limited autonomous generalisation capabilities. To address these limitations, this work proposes a model for dual-arm manipulators based on dynamic movement primitives lying in two orthogonal spaces. The modularity and learning capabilities of this model are leveraged to formulate a novel end-to-end learning-based framework which (i) learns a library of primitive skills from human demonstrations, and (ii) composes such knowledge simultaneously and sequentially to confront novel scenarios. The feasibility of the proposal is evaluated by teaching the iCub humanoid the basic skills needed to succeed at simulated dual-arm pick-and-place tasks. The results suggest that the learning and generalisation capabilities of the proposed framework extend to autonomously conducting undemonstrated dual-arm manipulation tasks. Comment: Annual Conference Towards Autonomous Robotic Systems (TAROS19).
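
    As a point of reference for the primitive skills mentioned above, the sketch below implements a single one-dimensional discrete dynamic movement primitive: it is fitted to one demonstrated trajectory and can be rolled out towards a new goal. The dual-arm composition in two orthogonal spaces described in the paper is not reproduced, and the gains and basis parameters are illustrative defaults.

        # Minimal sketch of one discrete DMP, assuming a 1-D trajectory sampled at
        # a fixed rate dt; parameter values are illustrative, not the paper's.
        import numpy as np

        class DiscreteDMP:
            def __init__(self, n_basis=20, alpha=25.0, beta=6.25, alpha_x=3.0):
                self.alpha, self.beta, self.alpha_x = alpha, beta, alpha_x
                self.centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
                self.widths = n_basis ** 1.5 / self.centers / alpha_x
                self.weights = np.zeros(n_basis)

            def _features(self, x):
                psi = np.exp(-self.widths * (x - self.centers) ** 2)
                return psi / psi.sum()

            def fit(self, y, dt):
                """Learn the forcing-term weights from one demonstrated trajectory."""
                y = np.asarray(y, dtype=float)
                yd, ydd = np.gradient(y, dt), np.gradient(np.gradient(y, dt), dt)
                self.y0, self.g, self.tau = y[0], y[-1], len(y) * dt
                x = np.exp(-self.alpha_x * np.linspace(0.0, 1.0, len(y)))
                f_target = (self.tau ** 2) * ydd - self.alpha * (
                    self.beta * (self.g - y) - self.tau * yd)
                Phi = np.stack([self._features(xi) for xi in x]) * x[:, None] * (self.g - self.y0)
                self.weights = np.linalg.lstsq(Phi, f_target, rcond=None)[0]

            def rollout(self, dt, goal=None):
                """Generate a trajectory, optionally towards a new goal."""
                g = self.g if goal is None else goal
                y, yd, x, out = self.y0, 0.0, 1.0, []
                for _ in range(int(self.tau / dt)):
                    f = self._features(x) @ self.weights * x * (g - self.y0)
                    ydd = (self.alpha * (self.beta * (g - y) - self.tau * yd) + f) / self.tau ** 2
                    yd, y = yd + ydd * dt, y + yd * dt
                    x += -self.alpha_x * x / self.tau * dt
                    out.append(y)
                return np.array(out)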

    User evaluation of an interactive learning framework for single-arm and dual-arm robots

    Social robots are expected to adapt to their users and, like their human counterparts, learn from the interaction. In our previous work, we proposed an interactive learning framework that enables a user to intervene and modify a segment of the robot arm trajectory. The framework uses gesture teleoperation and reinforcement learning to learn new motions. In the current work, we compared the user experience with the proposed framework implemented on single-arm and dual-arm Barrett 7-DOF WAM robots equipped with a Microsoft Kinect camera for user tracking and gesture recognition. User performance and workload were measured in a series of trials with two groups of 6 participants who used the two robot settings in different orders for counterbalancing. The experimental results showed that, for the same task, users required less time and produced shorter robot trajectories with the single-arm robot than with the dual-arm robot. The results also showed that users who performed the task with the single-arm robot first experienced considerably less workload when performing the task with the dual-arm robot, while achieving a higher task success rate in a shorter time.

    Deep active learning for autonomous navigation

    Imitation learning refers to an agent's ability to mimic a desired behavior by learning from observations. A major challenge in learning from demonstrations is representing the demonstrations in a manner that is adequate for learning and efficient for real-time decisions. Creating feature representations is especially challenging when they are extracted from high-dimensional visual data. In this paper, we present a method for imitation learning from raw visual data. The proposed method is applied to a popular imitation learning domain that is relevant to a variety of real-life applications, namely navigation. To create a training set, a teacher uses an optimal policy to perform a navigation task, and the actions taken are recorded along with visual footage from the first-person perspective. Features are automatically extracted and used to learn a policy that mimics the teacher via a deep convolutional neural network. A trained agent can then predict an action to perform based on the scene it finds itself in. This method is generic, and the network is trained without knowledge of the task, targets or environment in which it is acting. Another common challenge in imitation learning is generalizing a policy to situations unseen in the training data. To address this challenge, the learned policy is subsequently improved by employing active learning. While the agent is executing a task, it can query the teacher for the correct action to take in situations where it has low confidence. The active samples are added to the training set and used to update the initial policy. The proposed approach is demonstrated on four different tasks in a 3D simulated environment. The experiments show that an agent can effectively perform imitation learning from raw visual data for navigation tasks and that active learning can significantly improve the initial policy using a small number of samples. The simulated test bed facilitates reproduction of these results and comparison with other approaches.
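
    The query step described above can be summarised in a few lines: the agent follows its own policy while it is confident and asks the teacher otherwise, storing the answer for retraining. The network interface, threshold, and teacher object below are assumptions for illustration, not the paper's implementation.

        # Minimal sketch, assuming a trained policy exposing class probabilities
        # and a teacher object that can supply the correct action on request.
        import numpy as np

        def act_or_query(policy, frame, teacher, dataset, threshold=0.6):
            """Act autonomously when confident; otherwise query the teacher."""
            probs = policy.predict_proba(frame)      # assumed API of the trained CNN
            if float(np.max(probs)) >= threshold:
                return int(np.argmax(probs))         # exploit the learned policy
            action = teacher.correct_action(frame)   # assumed teacher interface
            dataset.append((frame, action))          # active sample kept for retraining
            return action

    Retraining the network on the augmented dataset then yields the improved policy whose gains are reported in the experiments.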

    Robots that can adapt like animals

    As robots leave the controlled environments of factories to function autonomously in more complex, natural environments, they will have to respond to the inevitable fact that they will become damaged. However, while animals can quickly adapt to a wide variety of injuries, current robots cannot "think outside the box" to find a compensatory behavior when damaged: they are limited to their pre-specified self-sensing abilities, can diagnose only anticipated failure modes, and require a pre-programmed contingency plan for every type of potential damage, an impracticality for complex robots. Here we introduce an intelligent trial-and-error algorithm that allows robots to adapt to damage in less than two minutes, without requiring self-diagnosis or pre-specified contingency plans. Before deployment, a robot exploits a novel algorithm to create a detailed map of the space of high-performing behaviors: this map represents the robot's intuitions about what behaviors it can perform and their value. If the robot is damaged, it uses these intuitions to guide a trial-and-error learning algorithm that conducts intelligent experiments to rapidly discover a compensatory behavior that works in spite of the damage. Experiments reveal successful adaptations for a legged robot injured in five different ways, including damaged, broken, and missing legs, and for a robotic arm with joints broken in 14 different ways. This new technique will enable more robust, effective, autonomous robots, and suggests principles that animals may use to adapt to injury.
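
    The behavior-performance map mentioned above can be built, in simplified form, with an illumination-style search of the kind sketched below; the published method pairs such a map with a Bayesian-optimization adaptation step, which is omitted here. The simulator, behavior descriptor, random genome generator, and mutation operator are assumed to be provided.

        # Minimal sketch of building a behavior-performance map; the simulator and
        # variation operators are assumptions, and the budget is illustrative.
        import random

        def build_behavior_map(simulate, random_genome, mutate, n_evals=100000):
            """Map each discretized behavior descriptor to the best genome found."""
            archive = {}   # descriptor cell -> (genome, performance)
            for i in range(n_evals):
                if i < 1000 or not archive:
                    genome = random_genome()                   # bootstrap with random solutions
                else:
                    parent, _ = random.choice(list(archive.values()))
                    genome = mutate(parent)                    # vary a stored elite
                performance, descriptor = simulate(genome)     # assumed simulator interface
                cell = tuple(round(d, 1) for d in descriptor)  # coarse discretization
                if cell not in archive or performance > archive[cell][1]:
                    archive[cell] = (genome, performance)      # keep the best per cell
            return archive

    After damage, the adaptation step described in the abstract uses the stored map to choose which behaviors to test on the physical robot.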

    From Demonstrations to Task-Space Specifications: Using Causal Analysis to Extract Rule Parameterization from Demonstrations

    Learning models of user behaviour is an important problem that is broadly applicable across many application domains requiring human-robot interaction. In this work, we show that it is possible to learn generative models for distinct user behavioural types, extracted from human demonstrations, by enforcing clustering of preferred task solutions within the latent space. We use these models to differentiate between user types and to find cases with overlapping solutions. Moreover, we can alter an initially guessed solution to satisfy the preferences that constitute a particular user type by backpropagating through the learned differentiable models. An advantage of structuring generative models in this way is that we can extract causal relationships between symbols that might form part of the user's specification of the task, as manifested in the demonstrations. We further parameterize these specifications through constraint optimization in order to find a safety envelope under which motion planning can be performed. We show that the proposed method is capable of correctly distinguishing between three user types, who differ in degrees of cautiousness in their motion, while performing the task of moving objects with a kinesthetically driven robot in a tabletop environment. Our method successfully identifies the correct type, within the specified time, in 99% [97.8-99.8] of the cases, which outperforms an IRL baseline. We also show that our proposed method correctly changes a default trajectory to one satisfying a particular user specification, even with unseen objects. The resulting trajectory is shown to be directly implementable on a PR2 humanoid robot completing the same task.
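
    The step of altering an initially guessed solution by backpropagating through the learned models can be illustrated as below. The scorer standing in for the trained generative model, and the optimization settings, are assumptions; only the trajectory is treated as a free variable while the model weights stay frozen.

        # Minimal sketch, assuming a differentiable, already-trained model that
        # scores how well a (T, D) trajectory tensor matches a given user type.
        import torch

        def adapt_trajectory(trajectory, type_scorer, user_type, steps=200, lr=0.01):
            """Nudge a trajectory towards a target user type, model frozen."""
            traj = trajectory.detach().clone().requires_grad_(True)
            optimizer = torch.optim.Adam([traj], lr=lr)
            for _ in range(steps):
                optimizer.zero_grad()
                loss = -type_scorer(traj, user_type)   # maximize the type score
                loss.backward()                        # gradients flow into the trajectory
                optimizer.step()
            return traj.detach()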

    Using informative behavior to increase engagement while learning from human reward

    In this work, we address a relatively unexplored aspect of designing agents that learn from human reward. We investigate how an agent's non-task behavior can affect a human trainer's training and the agent's learning. We use the TAMER framework, which facilitates the training of agents by human-generated reward signals, i.e., judgements of the quality of the agent's actions, as the foundation for our investigation. Then, starting from the premise that the interaction between the agent and the trainer should be bi-directional, we propose two new training interfaces to increase a human trainer's active involvement in the training process and thereby improve the agent's task performance. One provides information on the agent's uncertainty, a metric calculated as data coverage; the other on its performance. Our results from a 51-subject user study show that these interfaces can induce trainers to train longer and give more feedback. The agent's performance, however, increases only in response to the addition of performance-oriented information, not to the sharing of uncertainty levels. These results suggest that the organizational maxim about human behavior, "you get what you measure" (sharing metrics with people causes them to focus on optimizing those metrics while de-emphasizing other objectives), also applies to the training of agents. Using principal component analysis, we show how trainers in the two conditions train agents differently. In addition, by simulating the influence of the agent's uncertainty-informative behavior on a human's training behavior, we show that trainers can be distracted by the agent sharing its uncertainty levels about its actions, giving poor feedback for the sake of reducing the agent's uncertainty without improving the agent's performance.
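
    For context, the sketch below shows the core TAMER-style loop the interfaces are built around: a model of the human reward signal is learned online and the agent acts greedily with respect to it. The linear model and feature function are deliberately simple assumptions, and credit assignment over delayed feedback is omitted.

        # Minimal sketch, assuming a feature function featurize(state, action) and
        # scalar human feedback; the full TAMER framework also handles feedback delay.
        import numpy as np

        class TamerAgent:
            def __init__(self, n_features, actions, lr=0.1):
                self.w = np.zeros(n_features)   # weights of the human-reward model
                self.actions = actions
                self.lr = lr

            def predict_reward(self, features):
                return features @ self.w

            def choose_action(self, state, featurize):
                # Act greedily with respect to the predicted human reward.
                return max(self.actions,
                           key=lambda a: self.predict_reward(featurize(state, a)))

            def update(self, state, action, human_reward, featurize):
                # Move the model's prediction towards the trainer's feedback.
                phi = featurize(state, action)
                error = human_reward - self.predict_reward(phi)
                self.w += self.lr * error * phi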

    Spatiotemporal Coordination Supports a Sense of Commitment in Human-Robot Interaction

    In the current study, we presented participants with videos in which a humanoid robot (iCub) and a human agent were tidying up by moving toys from a table into a container. In the High Coordination condition, the two agents worked together in a coordinated manner, with the human picking up the toys and passing them to the robot. In the Low Coordination condition, they worked in parallel without coordinating. Participants were asked to imagine themselves in the position of the human agent and to respond to a battery of questions probing the extent to which they felt committed to the joint action. While we did not observe a main effect of our coordination manipulation, the results do reveal that participants who perceived a higher degree of coordination also indicated a greater sense of commitment to the joint action. Moreover, the results show that participants' sensitivity to the coordination manipulation was contingent on their prior attitudes towards the robot: participants in the High Coordination condition reported a greater sense of commitment than participants in the Low Coordination condition, except among those participants who were a priori least inclined to experience a close sense of relationship with the robot.

    An assigned responsibility system for robotic teleoperation control

    This paper proposes an architecture that addresses a gap in the spectrum of existing strategies for robot control mode switching in adjustable autonomy. In situations where the environment is reasonably well known and/or predictable, pre-planning these control changes could relieve robot operators of the additional task of deciding when and how to switch. Such a strategy provides a clear division of labour between the automation and the human operator(s) before the job even begins, allowing individual responsibilities to be known ahead of time, limiting confusion and allowing rest breaks to be planned. Assigned Responsibility is a new form of adjustable-autonomy-based teleoperation that allows the selective inclusion of automated control elements at key stages of a robot operation plan's execution. Progression through these stages is controlled by automatic goal accomplishment tracking. An implementation is evaluated through engineering tests and a usability study, demonstrating the viability of this approach and offering insight into its potential applications.
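
    The sketch below captures the idea as described here: responsibility for each stage of the operation plan is fixed before execution, and progression between stages is driven by automatic goal-accomplishment tracking. Stage names, the goal checks, and the two control loops are illustrative assumptions, not the evaluated implementation.

        # Minimal sketch, assuming callable goal checks and per-cycle control
        # functions for the human operator and the automation.
        from dataclasses import dataclass
        from typing import Callable

        @dataclass
        class Stage:
            name: str
            controller: str                    # "human" or "automation", assigned up front
            goal_reached: Callable[[], bool]   # automatic goal-accomplishment check

        def execute_plan(stages, step_human, step_automation):
            """Run each stage under its pre-assigned controller until its goal is met."""
            for stage in stages:
                step = step_human if stage.controller == "human" else step_automation
                while not stage.goal_reached():
                    step(stage.name)           # one control cycle under the assigned party
                # Goal accomplished: responsibility passes to the next stage's assignee.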